15 research outputs found

Sequence determinants in human polyadenylation site selection

Author: A Moreira
C Burge
D Gautheret
D Zarkower
DF Colgan
E Beaudoing
E Beaudoing
F Chen
G Edwalds-Gilbert
G Pesole
J Zhao
JE Tabaska
N Proudfoot
RV Davuluri
S Brackenridge
Y Aissouni
ZF Chou
Publication venue: BioMed Central
Publication date: 01/01/2003

BACKGROUND: Differential polyadenylation is a widespread mechanism in higher eukaryotes producing mRNAs with different 3' ends in different contexts. This involves several alternative polyadenylation sites in the 3' UTR, each with its specific strength. Here, we analyze the vicinity of human polyadenylation signals in search of patterns that would help discriminate strong and weak polyadenylation sites, or true sites from randomly occurring signals. RESULTS: We used human genomic sequences to retrieve the region downstream of polyadenylation signals, usually absent from cDNA or mRNA databases. Analyzing 4956 EST-validated polyadenylation sites and their -300/+300 nt flanking regions, we clearly visualized the upstream (USE) and downstream (DSE) sequence elements, both characterized by U-rich (not GU-rich) segments. The presence of a USE and a DSE is the main feature distinguishing true polyadenylation sites from randomly occurring A(A/U)UAAA hexamers. While USEs are indifferently associated with strong and weak poly(A) sites, DSEs are more conspicuous near strong poly(A) sites. We then used the region encompassing the hexamer and DSE as a training set for poly(A) site identification by the ERPIN program and achieved a prediction specificity of 69 to 85% for a sensitivity of 56%. CONCLUSION: The availability of complete genomes and large EST sequence databases now permits large-scale observation of polyadenylation sites. U-rich sequences flanking both sides of poly(A) signals contribute to the definition of "true" sites. However, the downstream U-rich sequences may also play an enhancing role. Based on this information, poly(A) site prediction accuracy was moderately but consistently improved compared to the best previously available algorithm.
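
The abstract above describes locating A(A/U)UAAA hexamers and inspecting the U-richness of their flanking regions. A minimal Python sketch of that idea follows. It is not the ERPIN method used in the paper: the function names, the 30 nt window size, and the use of a plain T fraction as a stand-in for "U-rich" are all assumptions made for illustration, and sequences are taken to be in the DNA alphabet (U written as T).

```python
import re

# A(A/U)UAAA written in the DNA alphabet (U -> T). The lookahead lets
# overlapping hexamers be reported as separate hits.
HEXAMER = re.compile(r"(?=(A[AT]TAAA))")


def find_hexamers(seq):
    """Return 0-based start positions of every A(A/T)TAAA hexamer."""
    return [m.start() for m in HEXAMER.finditer(seq.upper())]


def t_fraction(seq, start, end):
    """Fraction of T in seq[start:end], clipped to the sequence bounds.

    Returns 0.0 when the clipped window is empty.
    """
    window = seq.upper()[max(start, 0):min(end, len(seq))]
    return window.count("T") / len(window) if window else 0.0


def score_sites(seq, flank=30):
    """Report (position, hexamer, upstream T fraction, downstream T fraction).

    `flank` is a hypothetical window size chosen for illustration; it is
    not a value taken from the paper.
    """
    results = []
    for pos in find_hexamers(seq):
        upstream = t_fraction(seq, pos - flank, pos)
        downstream = t_fraction(seq, pos + 6, pos + 6 + flank)
        results.append((pos, seq[pos:pos + 6].upper(), upstream, downstream))
    return results
```

A candidate hexamer with high T fractions on both sides would, under these assumptions, look more like a "true" site than an isolated hexamer; any real classifier would need thresholds learned from validated sites.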

Crossref

HAL AMU

Springer - Publisher Connector

Directory of Open Access Journals

HAL-Inserm

PubMed Central

Efficient pairwise RNA structure prediction and alignment using sequence alignment constraints

Author: AV Uzilov
B Gulko
B Knudsen
B Knudsen
B Morgenstern
D Sankoff
DH Mathews
DH Mathews
DH Mathews
DKY Chiu
DS Fields
E Rivas
G Storz
I Holmes
I Holmes
I Holmes
IL Hofacker
IL Hofacker
IL Hofacker
J Gorodkin
J Gorodkin
J Gorodkin
J Reeder
J Wuyts
J Wuyts
JE Hopcroft
JE Tabaska
JH Havgaard
M Zuker
M Zuker
MS Waterman
NR Pace
O Perriquet
PP Gardner
R Durbin
R Giegerich
R Green
R Lück
R Nussinov
RD Dowell
RD Dowell
Robin D Dowell
RR Gutell
RR Gutell
RR Gutell
S Batzoglou
S Griffiths-Jones
Sean R Eddy
SR Eddy
SV Muse
V Juan
VR Akmaev
Publication venue: BioMed Central
Publication date: 01/01/2006

BACKGROUND: We are interested in the problem of predicting secondary structure for small sets of homologous RNAs, by incorporating limited comparative sequence information into an RNA folding model. The Sankoff algorithm for simultaneous RNA folding and alignment is a basis for approaches to this problem. There are two open problems in applying a Sankoff algorithm: development of a good unified scoring system for alignment and folding, and development of practical heuristics for dealing with the computational complexity of the algorithm. RESULTS: We use probabilistic models (pair stochastic context-free grammars, pairSCFGs) as a unifying framework for scoring pairwise alignment and folding. A constrained version of the pairSCFG structural alignment algorithm was developed which assumes knowledge of a few confidently aligned positions (pins). These pins are selected based on the posterior probabilities of a probabilistic pairwise sequence alignment. CONCLUSION: Pairwise RNA structural alignment improves on structure prediction accuracy relative to single sequence folding. Constraining on alignment is a straightforward method of reducing the runtime and memory requirements of the algorithm. Five practical implementations of the pairwise Sankoff algorithm – this work (Consan), David Mathews' Dynalign, Ian Holmes' Stemloc, Ivo Hofacker's PMcomp, and Jan Gorodkin's FOLDALIGN – have comparable overall performance with different strengths and weaknesses.
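
The abstract above says that pins are chosen from the posterior probabilities of a pairwise sequence alignment. The sketch below shows one way such a selection could work in Python. It is not Consan's actual procedure: the function name, the 0.95 threshold, and the greedy highest-probability-first strategy are assumptions made for illustration. The input is taken to be a matrix where `posterior[i][j]` is the probability that position `i` of the first sequence aligns to position `j` of the second.

```python
def select_pins(posterior, threshold=0.95):
    """Pick confidently aligned position pairs (pins) from a posterior matrix.

    Candidates are pairs with posterior >= threshold (hypothetical cutoff).
    They are accepted greedily, most probable first, and a candidate is kept
    only if it is collinear with every pin accepted so far, so the returned
    pins never cross each other.
    """
    candidates = [
        (p, i, j)
        for i, row in enumerate(posterior)
        for j, p in enumerate(row)
        if p >= threshold
    ]
    # Most confident candidates first; the sort is stable for ties.
    candidates.sort(key=lambda c: -c[0])

    pins = []
    for _, i, j in candidates:
        collinear = all(
            (i < a and j < b) or (i > a and j > b) for a, b in pins
        )
        if collinear:
            pins.append((i, j))
    return sorted(pins)
```

Each accepted pin splits the alignment problem into independent sub-problems to its left and right, which is why a handful of pins can reduce the runtime and memory of a Sankoff-style dynamic program.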

Crossref

Springer - Publisher Connector

Directory of Open Access Journals

PubMed Central

Digital Commons@Becker

3D Profile-Based Approach to Proteome-Wide Discovery of Novel Human Chemokines

Author: A Bateman
A Gerber
A Tomczak
A Zlotnik
A Zlotnik
A Zlotnik
AA Maghazachi
Andrej Shevchenko
Aurelie Tomczak
B Rost
C Boshoff
C Gille
C Pasquier
CH Wu
CJ Sigrist
D Cozzetto
D Van Der Spoel
D Wan
David Drechsel
DT Jones
E Lindahl
EL Sonnhammer
F Cocchi
Frank Buchholz
G Magistrelli
G Wang
G Wang
HH de Jongh
I Letunic
I Poser
I Prudovsky
IW Chong
J Cheng
J Gough
J Schultz
J Wang
Jana Sontheimer
JD Bendtsen
JE Pease
JE Tabaska
JG Luz
JT Stine
K Hiller
K Ottersbach
KA Roebuck
Karim Fahmy
KY Blain
LN Kinch
M. Teresa Pisabarro
MA Marti-Renom
Marc Gentzel
MJ Betts
MJ Sippl
MJ Sippl
MJ Sippl
MT Pisabarro
O Shmueli
P Flicek
P Genin
P Horton
P Puntervoll
P Ruggiero
Paul Wrede
R Colobran
Rainer Hausdorf
RJ Nibbs
S Hunter
S Kumar
S Lata
SF Altschul
Stefanie Eichler
T Fujita
TT Murooka
U Widmer
W Humphrey
WF Van Gunsteren
Y Ueda
Z Johnson
Z Zhang
Publication venue: Public Library of Science
Publication date: 07/05/2012

Chemokines are small secreted proteins with important roles in immune responses. They share a conserved three-dimensional (3D) structure, the so-called IL8-like chemokine fold, which is supported by disulfide bridges characteristic of this protein family. Sequence- and profile-based computational methods have been proficient in discovering novel chemokines by making use of their sequence-conserved cysteine patterns. However, it has been recently shown that some chemokines escaped annotation by these methods due to low sequence similarity to known chemokines and to different arrangement of cysteines in sequence and in 3D. Innovative methods overcoming the limitations of current techniques may allow the discovery of new remote homologs in the still functionally uncharacterized fraction of the human genome. We report a novel computational approach for proteome-wide identification of remote homologs of the chemokine family that uses fold recognition techniques in combination with a scaffold-based automatic mapping of disulfide bonds to define a 3D profile of the chemokine protein family. By applying our methodology to all currently uncharacterized human protein sequences, we have discovered two novel proteins that, without having significant sequence similarity to known chemokines or characteristic cysteine patterns, show strong structural resemblance to known anti-HIV chemokines. Detailed computational analysis and experimental structural investigations based on mass spectrometry and circular dichroism support our structural predictions and highlight several other chemokine-like features. The results obtained support their functional annotation as putative novel chemokines and encourage further experimental characterization. The identification of remote homologs of human chemokines may provide new insights into the molecular mechanisms causing pathologies such as cancer or AIDS, and may contribute to the development of novel treatments. Besides, the genome-wide applicability of our methodology based on 3D protein family profiles may open up new possibilities for improving and accelerating protein function annotation processes.

CiteSeerX

Public Library of Science (PLOS)

Crossref

Directory of Open Access Journals

PubMed Central

FigShare

Intronic L1 Retrotransposons and Nested Genes Cause Transcriptional Interference by Inducing Intron Retention, Exonization and Cryptic Polyadenylation

Author: A Kumar
A Macia
A Mazo
A Piehler
AF Smit
Akio Kanai
Anni Hallikma
AR Kornblihtt
AR Muotri
B Brouha
BA Cunningham
BP Callen
C Steinhoff
CF Hongay
CW Gibson
DM Church
EM Ostertag
EM Prescott
ES Lander
F Liu
G Lev-Maor
GD Swergold
GJ Faulkner
H Khan
H Vihma
HC Birnboim
HI Nakaya
I Martianov
IB Dodd
IH Greger
J Chen
J Sambrook
JA Martens
JE Tabaska
JE Wilusz
Jelena Branovets
JS Han
JS Han
JS Mattick
K Matlik
KE Shearwin
KG Becker
Kristel Kaer
LE Maquat
LP Eperon
M Naville
M Padidam
M Speek
Mart Speek
MG Reese
ML Kimberland
N Crampton
N Gal-Mark
N Yang
ND Trinklein
NJ Proudfoot
O Leupin
P Nigumann
P Yu
Pilvi Nigumann
R Assis
R Druker
R Minakami
S Chavez
S Petruk
S Petruk
SE Celniker
SH Rangwala
SJ Wheelan
SK Eszterhas
SR Goldman
SV Ustyugova
T Tchenio
TG Petrakis
V Perepelitsa-Belancio
VP Belancio
WJ Kent
Y Han
Y Zhang
Z Yang
Publication venue: Public Library of Science
Publication date: 13/10/2011

Transcriptional interference has recently been recognized as an unexpectedly complex and mostly negative mode of gene regulation. Although relatively few studies have emerged in recent years, it has been demonstrated that readthrough transcription derived from one gene can influence the transcription of another overlapping or nested gene. However, the molecular effects resulting from this interaction are largely unknown. Using in silico chromosome walking, we searched for prematurely terminated transcripts bearing signatures of intron retention or exonization of intronic sequence at their 3' ends upstream of human L1 retrotransposons and of protein-coding and noncoding nested genes. We demonstrate that transcriptional interference induced by intronic L1s (or other repeated DNAs) and nested genes could be characterized by intron retention, forced exonization and cryptic polyadenylation. These molecular effects were revealed from the analysis of endogenous transcripts derived from different cell lines and tissues and confirmed by the expression of three minigenes in cell culture. While intron retention and exonization were comparably observed in introns upstream of L1s, forced exonization was preferentially detected in nested genes. Transcriptional interference induced by L1 or nested genes was dependent on the presence or absence of cryptic splice sites, and affected the inclusion or exclusion of the upstream exon and the use of cryptic polyadenylation signals. Our results suggest that transcriptional interference induced by intronic L1s and nested genes could influence the transcription of a large number of genes in normal as well as in tumor tissues. Therefore, this type of interference could have a major impact on the regulation of host gene expression.

Public Library of Science (PLOS)

Crossref

Directory of Open Access Journals

PubMed Central

Why Does the Giant Panda Eat Bamboo? A Comparative Analysis of Appetite-Reward-Related Genes among Mammals

Author: A Tanzer
BP Lewis
BP Lewis
C Jin
Chenyi Xue
CT Lee
D Smedley
DP Bartel
E Mikolajczyk
EC Pooley
ES Dierenfeld
GB Schaller
GQ Zhao
H Endo
H Endo
H Frieling
H Zhao
HR Berthoud
J Chandrashekar
J Haavik
J Vidgren
JA Roth
JE Tabaska
Jinyi Qian
JL Gittleman
K Katoh
K Rutherford
KC Miranda
Ke Jin
KM Tuohy
M Huotari
M Kozak
M Kozak
M Kozak
M Rask-Andersen
M Zuker
M. James C. Crabbe
Masami Hasegawa
MJ Salesa
N Eswar
NR Lenard
PT Mannisto
R Li
RA Wise
RC Friedman
RD Palmiter
S Fulton
S Griffiths-Jones
S Haider
SB Flagel
Takahiro Yonezawa
Vincent Laudet
Xiaoli Wu
Yang Zhong
Ying Cao
Yong Zhu
Yufang Zheng
Zhen Yang
Publication venue: Public Library of Science
Publication date: 01/01/2011

Background: The giant panda has an interesting bamboo diet unlike the other species in the order Carnivora. The umami taste receptor gene T1R1 was identified as a pseudogene during its genome sequencing project and confirmed using a different giant panda sample. The estimated mutation time for this gene is about 4.2 Myr. This mutation coincided with the giant panda’s dietary change and also reinforced its herbivorous lifestyle. However, as this gene is preserved in herbivores such as cow and horse, we need to look for other reasons behind the giant panda’s diet switch. Methodology/Principal Findings: Since taste is part of the reward properties of food related to its energy and nutrition contents, we performed a systematic analysis of the genes involved in the appetite-reward system for the giant panda. We extracted the giant panda sequence information for those genes and compared it first with the human sequence and then with seven other species including chimpanzee, mouse, rat, dog, cat, horse, and cow. Orthologs in panda were further analyzed based on the coding region, Kozak consensus sequence, and potential microRNA binding of those genes. Conclusions/Significance: Our results revealed an interesting dopamine metabolic involvement in the panda’s food choice.

CiteSeerX

Public Library of Science (PLOS)

Crossref

Directory of Open Access Journals

PubMed Central

University of Bedfordshire Repository

A Critical Analysis of Atoh7 (Math5) mRNA Splicing in the Developing Mouse Retina

Author: A Fedorov
A Kenneson
A Telesnitsky
AJ Berk
AJ Gentles
AJ McCullough
AP Feinberg
BA Jameson
BV Derjaguin
CB Saper
CE Holt
D Brett
DA Melton
DC Blackburn
DC Jeffares
DH Mathews
DL Turner
DM Altschuler
DS Mytelka
E Harlow
E Leygue
E Wahle
EA Cho
ED Harrington
EH Margulies
ER Oliver
FJ Livesey
Gregory S. Barsh
GS Mastick
J Cocquet
J Engelbrecht
JA Brzezinski
JA Brzezinski
JAt Brzezinski
JE Tabaska
JK Pfeiffer
JL Brincat
JN Kay
KE Baker
KJ Hertel
KJ Rhodes
KL Fox-Walsh
KN Stolting
LE Maquat
Lev Prasov
LL Wong
M Eisenstein
M Irimia
M Kitabayashi
M Zuker
MA Frohman
MK Sakharkar
N Bertrand
Nadean L. Brown
NJ Schonbrunner
NK Gray
NL Brown
NL Brown
NL Brown
O Jaillon
PG Zaphiropoulos
Q Li
QM Mitrovich
RA Rupp
RB Hufnagel
RI Dogan
RJ MacDonald
RM Mader
RN Kanadia
S Alonso
S Blackshaw
S Chooniedass-Kothari
S Kanekar
SB Baim
SM Berget
SM Berget
SM Saul
SW Roy
SW Wang
T Weissensteiner
Tom Glaser
TR Cech
TR Mercer
VA Wallace
W Henke
W Wu
WA Rees
WB Melchior Jr
X Mu
Z Wang
Z Yang
Publication venue: Public Library of Science
Publication date: 01/08/2010

The Math5 (Atoh7) gene is transiently expressed during retinogenesis by progenitors exiting mitosis, and is essential for ganglion cell (RGC) development. Math5 contains a single exon, and its 1.7 kb mRNA encodes a 149-aa polypeptide. Mouse Math5 mutants have essentially no RGCs or optic nerves. Given the importance of this gene in retinal development, we thoroughly investigated the possibility of Math5 mRNA splicing by Northern blot, 3′RACE, RNase protection assays, and RT-PCR, using RNAs extracted from embryonic eyes and adult cerebellum, or transcribed in vitro from cDNA clones. Because Math5 mRNA contains an elevated G+C content, we used graded concentrations of betaine, an isostabilizing agent that disrupts secondary structure. Although ∼10% of cerebellar Math5 RNAs are spliced, truncating the polypeptide, our results show few, if any, spliced Math5 transcripts exist in the developing retina (<1%). Rare deleted cDNAs do arise via RT-mediated RNA template switching in vitro, and are selectively amplified during PCR. These data differ starkly from a recent study (Kanadia and Cepko 2010), which concluded that the vast majority of Math5 and other bHLH transcripts are spliced to generate noncoding RNAs. Our findings clarify the architecture of the Math5 gene and its mechanism of action. These results have implications for all members of the bHLH gene family, for any gene that is alternatively spliced, and for the interpretation of all RT-PCR experiments

Public Library of Science (PLOS)

Crossref

Directory of Open Access Journals

PubMed Central

eScholarship - University of California

Analytical methods for inferring functional effects of single base pair substitutions in human cancers

Author: A Bardelli
A ElSharawy
A Lambert
A Majid
A Riva
A Sandelin
A Srebrow
A Tenesa
A Torkamani
A Torkamani
A Torkamani
AF Rubin
AG Knudson Jr
AJ Enright
B Gold
BC Kim
BJ Blencowe
BJ Druker
BN Chorley
BW Zanke
C Greenman
C Greenman
C Mayr
C Pettigrew
CA Haiman
CI Amos
CL Vogel
D Chasman
D GuhaThakurta
D Thierry-Mieg
DA Wheeler
DF Easton
DJ Hunter
DW Parsons
F Ciardiello
F Pagani
G Getz
G Thomas
GL Bond
GL Bond
GL Bond
I Tomlinson
IP Tomlinson
J Amberger
J Gu
J Gudmundsson
J Hull
JC Knight
JE Tabaska
JL Marks
JS Kaminker
JV Ponomarenko
JV Ponomarenko
K Chen
K Ogasawara
K Palin
KA Frazer
KB Meyer
L Breiman
L Cartegni
L Conde
L Ding
LD Wood
LR Kidd
LS Hon
LS Hon
M Ashburner
M Fedele
M Gobbi De
M Morley
ML Hastings
MP Miller
O Hallikas
P Broderick
P Radivojac
P Stephens
P Yue
P Yue
PA Futreal
PC Ng
PC Ng
PD Stenson
PD Thomas
PD Thomas
Peng Yue
PJ Mishra
R Jiang
R Karchin
RA Eeles
RD Finn
RE Steward
RJ Clifford
RJ Livingston
RR Freimuth
RS Spielman
S Griffiths-Jones
S Henikoff
S Jones
S Kobayashi
S Mazoyer
S Mooney
S Stamm
S Sunyaev
S Sunyaev
S Tuupanen
S Uitte de Willige
SC Parker
SF Kingsmore
SG Warneford
ST Sherry
T Pastinen
T Sjoblom
TCGA
TG Lugo
TJ Ley
V Matys
V Nembaware
VG Cheung
VG Cheung
VG Cheung
WF Forrest
William Lee
WR Atchley
X Wang
Y Benjamini
Y Samuels
Y Zhu
YS Lee
Z Wang
Z Wang
Zemin Zhang
Publication venue: Springer-Verlag
Publication date: 01/01/2009

Cancer is a genetic disease that results from a variety of genomic alterations. Identification of some of these causal genetic events has enabled the development of targeted therapeutics and spurred efforts to discover the key genes that drive cancer formation. Rapidly improving sequencing and genotyping technology continues to generate increasingly large datasets that require analytical methods to identify functional alterations that deserve additional investigation. This review examines statistical and computational approaches for the identification of functional changes among sets of single-nucleotide substitutions. Frequency-based methods identify the most highly mutated genes in large-scale cancer sequencing efforts, while bioinformatics approaches are effective for independent evaluation of both non-synonymous mutations and polymorphisms. We also review current knowledge and tools that can be utilized for analysis of alterations in non-protein-coding genomic sequence.

Crossref

Springer - Publisher Connector

PubMed Central

Skipping of Exons by Premature Termination of Transcription and Alternative Splicing within Intron-5 of the Sheep SCF Gene: A Novel Splice Variant

Author: A Abyzov
A Bernstein
A Charlesworth
A Hachiya
A Hachiya
A Kenneson
A Sharkey
A Stamatakis
AJ Matlin
Antonietta La Terza
B Wehrle-Haller
BJ Longley
BJ Longley Jr
C Hadjiconstantouras
C Renieri
CA Chen
Carlo Renieri
CI Brannan
CT Miller
D Beraldi
D Brett
D Jones
D Posada
D Thierry-Mieg
D Toksoz
Dario Pediconi
DC Bennett
DE Williams
DH Mathews
DI Vage
DI Vage
DJ Tisdall
DJ Tisdall
DM Anderson
DM Anderson
DS Mytelka
E Beaudoing
E Huang
EC Lai
EG Hutchinson
EH Margulies
EJ Huang
F Abascal
F Kiefer
F Odronitz
F Tajima
FH Martin
G Grillo
G Imokawa
G Koscielny
G Talavera
GE Crooks
GG McGill
H Zhang
HA Meijer
HE Hoekstra
HJ Cheng
HR Widlund
HS Lu
HY Huang
I Milne
J Felsenstein
J Grabbe
J Sambrook
JA Kerns
Jack Anthony Gilbert
JC Tan
JD Thompson
JE Tabaska
JG Flanagan
JH Zhou
JH Zhou
JM Grichnik
JN Petitte
JP Huelsenbeck
K Darty
K Miyazawa
K Tamura
K Tamura
KA Hultman
KE Baker
KE Langley
KM Zsebo
L Potterton
LL Baxter
M Anisimova
M Hasegawa
M Nei
M Tachibana
M Wang
M Zuker
MA Bedell
MK Majumdar
N Eswar
N Saitou
NG Copeland
NG Jablonski
NK Gray
NV Botchkareva
O Jaillon
O Meyuhas
ON Witte
P Besmer
P Flicek
P Welker
PC Gentry
Q Pan
R Kalendar
RA Laskowski
RA Laskowski
RC Edgar
RC Friedman
RM Shull
S Dolci
S Engelen
S Fukuchi
S Griffiths-Jones
S Lev
S Levy
S Nishikawa
S Rozen
S Tavaré
S Yuzawa
SB Baim
SF Altschul
Siva Arumugam Saravanaperumal
SJ Bultman
SJ Galli
TA Hall
TC Mayer
TH Jukes
VC Broudy
WJ Kent
WK Silvers
X Jiang
Y Yoshida
YM Parsons
YM Parsons
YR Hsu
Z Zhang
Z Zhang
Publication venue: Public Library of Science
Publication date: 01/01/2012

Stem cell factor (SCF) is a growth factor essential for haemopoiesis, mast cell development and melanogenesis. In the hematopoietic microenvironment (HM), SCF is produced in either membrane-bound (−) or soluble (+) forms. Skin expression of SCF stimulates melanocyte migration, proliferation, differentiation, and survival. We report, for the first time, a novel mRNA splice variant of SCF from the skin of white merino sheep via cloning and sequencing. Reverse transcriptase (RT)-PCR and molecular prediction revealed two different cDNA products of SCF. Full-length cDNA libraries were enriched by the method of rapid amplification of cDNA ends (RACE-PCR). Nucleotide sequencing and molecular prediction revealed that the primary 1519 base pair (bp) cDNA encodes a precursor protein of 274 amino acids (aa), commonly known as the ‘soluble’ isoform. In contrast, the shorter (835 and/or 725 bp) cDNA was found to be a ‘novel’ mRNA splice variant. It contains an open reading frame (ORF) corresponding to a truncated protein of 181 aa (vs 245 aa) with a unique C-terminus lacking the primary proteolytic segment (28 aa) right after the D175G site, which is necessary to produce the ‘soluble’ form of SCF. This alternative splice (AS) variant was explained by the complete nucleotide sequencing of the splice junction covering exon 5-intron (5)-exon 6 (948 bp) with a premature termination codon (PTC) whereby exons 6 to 9/10 are skipped (Cassette Exon, CE 6–9/10). We also demonstrated that the Northern blot analysis at transcript level is mediated via an intron-5 splicing event. Our data refine the structure of the SCF gene and clarify the presence (+) and/or absence (−) of primary proteolytic-cleavage site specific SCF splice variants. This work provides a basis for understanding the functional role and regulation of SCF in hair follicle melanogenesis in sheep beyond what was known in mice, humans and other mammals.

Public Library of Science (PLOS)

Crossref

Directory of Open Access Journals

PubMed Central

Archivio istituzionale della ricerca - Università di Camerino

FigShare